- factor analysis
- A family of statistical techniques for exploring data, generally used to simplify the procedures of analysis, mainly by examining the internal structure of a set of variables in order to identify any underlying constructs. The most common version is so-called principal component factor analysis.
In survey data, it is often the case that attitudinal, cognitive, or evaluative characteristics go together. For example, respondents who are in favour of capital punishment may also be opposed to equality of opportunity for racial minorities, opposed to abortion, and may favour the outlawing of trade unions and the right to strike, so that these items are all intercorrelated. Similarly, we might expect that those who endorse these (in the British context) right-wing political values may also support right-wing economic values, such as the privatization of all state-owned utilities, reduction of welfare state benefits, and suspension of minimum-wage legislation. Where these characteristics do go together, they are said either to be a factor, or to load on to an underlying factor: in this case, what one might call the factor ‘authoritarian conservatism’.
Factor analysis techniques are available in a variety of statistical packages and can be used for a number of different purposes. For example, one common use is to assess the ‘factorial validity’ of the various questions comprising a scale, by establishing whether or not the items are measuring the same concept or variable. Confronted by data from a battery of questions all asking about different aspects of (say) satisfaction with the government, it may be that individual items dealing with particular economic, political, and social policies, the government's degree of trustworthiness, and the respondent's satisfaction with the President are not related, which suggests that these different aspects are seen as conceptually distinct by interviewees.
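The idea that intercorrelated items load on a common underlying factor can be illustrated with a small simulation. What follows is a minimal sketch rather than a definitive procedure: the data are synthetic, the number of items and the coefficients are arbitrary assumptions, and the loadings are extracted by the principal component method from the correlation matrix using NumPy.

```python
import numpy as np

# Synthetic "survey": 2,000 respondents and six attitude items.
# All sizes and coefficients here are illustrative assumptions.
rng = np.random.default_rng(0)
n = 2000
latent = rng.standard_normal(n)  # hypothetical underlying attitude

# Items 0-3 share the latent attitude; items 4-5 are unrelated noise.
related = 0.8 * latent[:, None] + 0.6 * rng.standard_normal((n, 4))
unrelated = rng.standard_normal((n, 2))
items = np.hstack([related, unrelated])

# Correlation matrix of the items (columns are variables).
corr = np.corrcoef(items, rowvar=False)

# Principal component extraction: eigendecomposition of the correlation matrix.
eigvals, eigvecs = np.linalg.eigh(corr)  # eigh returns ascending eigenvalues
order = np.argsort(eigvals)[::-1]        # re-sort from largest to smallest
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

# Loadings are eigenvectors scaled by the square root of their eigenvalues.
loadings = eigvecs * np.sqrt(eigvals)
first_factor = loadings[:, 0]
if first_factor.sum() < 0:               # the sign of a factor is arbitrary
    first_factor = -first_factor

print(np.round(first_factor, 2))
```

In this sketch the four related items should load strongly on the first factor while the two unrelated items load close to zero, which is the pattern an analyst would read as a single underlying construct.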
Similarly, for any given set of variables, factor analysis can determine the extent to which these can be reduced to a smaller set in order to simplify the analysis, without losing any of the underlying concepts or variables being measured. Alternatively, researchers may ask respondents to describe the characteristics of a social attribute or person (such as ‘class consciousness’ or ‘mugger’), and factor-analyse the adjectives applied to see how the various characteristics are grouped.
All these uses are ‘exploratory’, in the sense that they attempt to determine which variables are related to which, without in any sense testing or fitting a particular model. Consequently, as is often the case in this kind of analysis, researchers may have difficulty interpreting the underlying factors on to which the different groups of variables load. Some marvellously imaginative labels have been devised by sociologists who have detected apparent underlying factors but have no clear idea of what these higher-order abstractions might be. Less frequently, however, a ‘confirmatory’ factor analysis is undertaken. Here, the researcher anticipates that a number of items measuring (say) ‘job satisfaction’ all form one factor, and this proposition is then tested by comparing the actual results with a solution in which the factor loading is perfect.
Alternative criteria exist for determining the best method for doing the analysis, the number of factors to be retained, and the extent to which the computer should ‘rotate’ factors to make them easier to interpret. An ‘orthogonal’ rotation yields factors which are unrelated to each other whereas an ‘oblique’ rotation allows the factors themselves to be correlated; and, as might be expected, there is some controversy about which procedure is the more plausible in any analysis.
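What an orthogonal rotation does to a set of loadings can be sketched as follows. The loadings are hypothetical and the 45-degree angle is chosen by hand for illustration; in practice a statistical package selects the rotation by optimizing a criterion such as varimax. The sketch shows that rotation changes how easily the loadings can be read without changing how well they reproduce the correlations.

```python
import numpy as np

# Hypothetical unrotated loadings for six items on two factors:
# a general factor (all positive) and a bipolar factor (mixed signs).
unrotated = np.array([
    [0.6,  0.6],
    [0.6,  0.6],
    [0.6,  0.6],
    [0.6, -0.6],
    [0.6, -0.6],
    [0.6, -0.6],
])

# Orthogonal rotation matrix for a hand-picked angle (an assumption;
# software would choose the angle by a criterion such as varimax).
theta = np.pi / 4
rotation = np.array([
    [np.cos(theta), -np.sin(theta)],
    [np.sin(theta),  np.cos(theta)],
])
rotated = unrotated @ rotation

# Each item's communality (sum of squared loadings) is unchanged.
communalities_before = (unrotated ** 2).sum(axis=1)
communalities_after = (rotated ** 2).sum(axis=1)

# The correlations implied by the loadings are also unchanged.
same_fit = np.allclose(unrotated @ unrotated.T, rotated @ rotated.T)

print(np.round(rotated, 2))
print(same_fit)
```

After rotation each item loads on only one of the two factors, so the two groups of items are easier to label, while the communalities and the implied correlations stay the same.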
Although there are conventions about the extent to which variables should correlate before any are omitted from a factor, and the amount of variance (see variation) to be explained by a factor before it may be ignored as insignificant, these too are matters of some debate. The general rule of thumb is that there should be at least three variables per factor, for meaningful interpretation, and that factors with an ‘eigenvalue’ of less than one should be discarded. (An eigenvalue expresses the variance explained by a factor in units of the variance of a single standardized variable, so that an eigenvalue of one corresponds to the variance contributed, on average, by one variable in the data. It is thus a standardized measure which allows researchers to eliminate those factors that account for less of the variance than the average variable.) However, even when a factor has an eigenvalue greater than one, there is little to be gained by retaining it unless it can be interpreted and is substantively meaningful. At that point, statistical analysis ceases, and sociological theory and imagination take over. Moreover, the correlation matrix which is produced for the variables in any set, and which yields the data from which factors are extracted, requires for its calculation variables which have been measured at the interval level and have a normal distribution. The use of the technique is therefore often accompanied by disputes as to whether or not these conditions have been met.
For a useful introduction by a sociologist, see ‘Factor Analysis’, in Encyclopedia of Sociology (1992). See also measurement; personality; scree test.
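The eigenvalue rule of thumb described in this entry can be sketched as follows. The correlation matrix is a made-up example (two groups of three items, correlated 0.6 within each group and not at all between groups), and the sketch applies only the statistical cut-off; it says nothing about whether a retained factor is substantively meaningful.

```python
import numpy as np

# Hypothetical correlation matrix: two blocks of three items,
# correlated 0.6 within a block and 0.0 between blocks.
within = 0.6
block = np.full((3, 3), within)
np.fill_diagonal(block, 1.0)
zeros = np.zeros((3, 3))
corr = np.block([[block, zeros],
                 [zeros, block]])

# Eigenvalues of the correlation matrix, largest first.
eigvals = np.sort(np.linalg.eigvalsh(corr))[::-1]

# Eigenvalues sum to the number of variables, so dividing by that
# sum gives each factor's share of the total variance.
share = eigvals / eigvals.sum()

# Rule of thumb: keep factors whose eigenvalue exceeds one.
retained = int((eigvals > 1.0).sum())

print(np.round(eigvals, 2))
print(np.round(share, 3))
print(retained)
```

Here two factors exceed the cut-off, each with three variables behind it, and the remaining four fall below the variance of a single average variable.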
Dictionary of sociology. 2013.